Abstract: We introduce a simple framework for learning aggressive
maneuvers in flight control of UAVs. Inspired by biological systems,
dynamic movement primitives are analyzed and extended using nonlinear
contraction theory. Accordingly, primitives of an observed movement are
stably combined and concatenated. We demonstrate our results
experimentally on the Quanser Helicopter, in which we first imitate
aggressive maneuvers and then use them as primitives to achieve new
maneuvers that can fly over an obstacle.

I. Introduction 

The role of UAVs (Unmanned Aerial Vehicles) has gained significant
importance in the last decades. They have many advantages (agility, low
surface area, ability to work in constrained or dangerous places) over
their conventional predecessors. In addition, current UAVs are more
biologically inspired in shape and performance because of improvements
in electronics and propulsion. Unfortunately, we are still far from
using their capacity to the fullest. This is mostly due to the weakness
of current control algorithms in high-dimensional and nonlinear
environments. In this sense, generating aggressive maneuvers is both
interesting and hard to accomplish.

In this paper, our approach to this issue is motivated by experiments
on frogs and monkeys, which suggest that we are faced with an
inverse-kinematics algorithm that adapts to the environment and to
changes in a sequence of target points, irrespective of the initial
conditions. In theory, we analyze dynamic movement primitives (DMPs)
[18] and combine them using contraction theory. In experiments, the
obstacle-avoidance DMP of human-piloted flight data is segmented into
parts, which are combined at different initial points to achieve
maneuvers against different obstacles at different locations. The
background of our work is briefly detailed below.

A. Background 

1) Imitation Learning: "By three methods we may learn wisdom: first, by
reflection, which is noblest; second, by imitation, which is easiest;
and third, by experience, which is the most bitter." (Confucius)
Imitation takes place when an agent learns a behavior by observing the
execution of that behavior by a teacher [?]. Imitation is not unique to
humans; it is also observed in animals. For example, experiments show
that kittens exposed to adult cats manipulate levers to retrieve food
much faster than the control group [28].



There have been a number of applications of imitation learning in the
field of robotics. Studies on locomotion [5], [6], [34], humanoid
robots [?], [3], [29], [27], and human-robot interactions [33], [20]
have used imitation learning or movement primitives. The emphasis in
these studies is on primitive derivation and movement classification
[30]; combinations of the primitives [?], [16], [22], [23], [?], [?];
and primitive models [?], [?], [?], [?] in order to extract behaviors.

2) Aggressive Maneuvers: Aggressive control of autonomous helicopters
represents a challenging problem for engineers. The challenge stems
from the highly nonlinear and unstable nature of the dynamics, along
with the nonlinear relations for actuator saturation. Nevertheless,
successful unmanned helicopter examples [?], [35], [?], [?], [37],
[38], [39], [?], [?] can be found in the literature. However, model
helicopters controlled by humans can achieve considerably more complex
and aggressive maneuvers than can be done autonomously with the state
of the art. In [36], it is observed that after several repetitions of
the same maneuver performed by a human, the generated trajectories are
similar and the control inputs are well-structured and repetitive.
Hence, it is natural to focus on understanding human maneuvers to find
proper algorithms for unmanned control.

3) Biological Motivation: In their experiment with deafferented and
intact monkeys, Bizzi [24] found that a certain movement can be
executed regardless of initial conditions, emphasizing the importance
of feedback control. In particular, they have shown that the control
variable is the equilibrium state of the agonist and antagonist
muscles. The same experimental setup is used again in [25] to
characterize the trajectory of the motion. Their results additionally
suggest that the movement, called a "virtual trajectory", is composed
of more than one equilibrium point, and that the central nervous system
uses the stability of the lower level of the motor system to simplify
the generation of movement primitives [25].

The experiments of Bizzi [?] and Mussa-Ivaldi [26] on frogs provide
further clues for understanding movement primitives. They
microstimulated the spinal cord and measured the forces at the ankle.
Having repeated this process with the ankle placed at nine to 16
locations, they observed that the collection of measured forces always
converges to a single equilibrium point. In their model, inverse
kinematics plays a crucial role in achieving the endpoint trajectory
(see Mussa-Ivaldi [?]).



II. Analysis of DMP 

This section outlines the analysis of the DMP algorithm 
using contraction theory. 

A. DMP Algorithm 

DMP is a trajectory generation algorithm which interpolates between the
start and end points of a path based on learning. The system can be
represented by

$$\tau \dot{z} = \alpha_z (\beta_z (g - y) - z) \qquad (1)$$
$$\tau \dot{y} = z + f, \qquad (2)$$

where $y$, $\dot{y}$ and $\ddot{y}$ characterize the desired
trajectory, $\alpha_z$ and $\beta_z$ are time constants, $\tau$ is a
temporal scaling factor, and $g$ is the desired end point. In addition,
the canonical system is given by

$$\tau \dot{v} = \alpha_z (\beta_z (g - x) - v) \qquad (3)$$
$$\tau \dot{x} = v \qquad (4)$$



In general, assuming that the f-function is zero, the system converges
to g exponentially. The goal of the DMP algorithm is to modify this
exponential path: the f-function makes the system nonlinear and allows
us to generate desired trajectories between the origin and the point g.

The f-function is a normalized linear combination of Gaussians which
helps to approximate the final trajectory, i.e., it has the general
form

$$f(x, v, g) = \frac{\sum_{i=1}^{N} \psi_i w_i v}{\sum_{i=1}^{N} \psi_i} \qquad (5)$$

where

$$\psi_i = \exp\{-h_i (x/g - c_i)^2\}. \qquad (6)$$
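The system of Eqs. (1)-(6) can be sketched numerically as follows. This is a minimal illustration, not the authors' implementation: the explicit Euler scheme, the gains `alpha` and `beta`, the kernel centers and widths, and the example weights are all assumptions chosen only so that the sketch runs.

```python
import math

def forcing_term(x, v, g, weights, centers, widths):
    """Eqs. (5)-(6): normalized Gaussian mixture multiplied by v."""
    psi = [math.exp(-h * (x / g - c) ** 2) for h, c in zip(widths, centers)]
    return sum(p * w for p, w in zip(psi, weights)) * v / sum(psi)

def rollout(g, weights, centers, widths, tau=1.0, alpha=25.0, beta=6.25,
            dt=0.001, duration=3.0):
    """Euler integration of Eqs. (1)-(4) from rest at the origin; returns y(t)."""
    x = v = y = z = 0.0
    ys = []
    for _ in range(int(round(duration / dt))):
        f = forcing_term(x, v, g, weights, centers, widths)
        dv = alpha * (beta * (g - x) - v) / tau   # Eq. (3)
        dx = v / tau                              # Eq. (4)
        dz = alpha * (beta * (g - y) - z) / tau   # Eq. (1)
        dy = (z + f) / tau                        # Eq. (2)
        v, x = v + dv * dt, x + dx * dt
        z, y = z + dz * dt, y + dy * dt
        ys.append(y)
    return ys

# Hypothetical kernel layout and weights, for illustration only.
CENTERS = [0.0, 0.25, 0.5, 0.75, 1.0]
WIDTHS = [50.0] * 5

straight = rollout(1.0, [0.0] * 5, CENTERS, WIDTHS)                   # f = 0
shaped = rollout(1.0, [4.0, -2.0, 1.0, -1.0, 0.5], CENTERS, WIDTHS)   # f != 0
```

With zero weights the output is the plain exponential approach to g; nonzero weights reshape the path, while the end point is unchanged because f is multiplied by v, which vanishes as the canonical system settles.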



B. Rhythmic DMPs 

The DMP algorithm can also be extended to rhythmic movements [46] by
replacing the canonical system; the corresponding equations are
numbered (7)-(12).



C. Learning of primitives using DMPs 

The learning aspect of the algorithm comes into play with the
computation of the weights ($w_i$) of the Gaussians. The weights are
derived from Eq. (1) and Eq. (2) using the training trajectory
$y_{demo}$ and $\dot{y}_{demo}$ as variables. Once the parameters of
the f-function are learned, the DMP can simply be used to regenerate
the original trajectory. As detailed below, spatial and temporal shifts
are achieved by adjusting $g$ and $\tau$, respectively.

• Spatial adjustments: The first system [Eq. (1), Eq. (2)] can be seen
as a linear system, because the variable v in the f-function is only
multiplied by a time-varying coefficient. Hence, by superposition, the
output (y) is simply scaled by $g$.

• Temporal adjustments: The second system [Eq. (3), Eq. (4)] is simply
linear. In addition, the f-function is linear because the multiplier is
a time-varying coefficient, temporally scaled by $\tau$. Thus, from
linearity, temporal adjustments of the whole system are carried out by
just changing the variable $\tau$.
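Both adjustments can be checked directly on the DMP equations. The derivation below is a sketch under two assumptions that the text does not state explicitly: all states start at zero, and $g \neq 0$. The normalized variables (marked with a tilde) and the rescaled time $s$ are introduced here only for illustration.

```latex
\begin{aligned}
&\text{Let } \tilde{x} = x/g,\quad \tilde{v} = v/g,\quad \tilde{y} = y/g,\quad \tilde{z} = z/g. \\
&\tau \dot{\tilde{v}} = \alpha_z\bigl(\beta_z(1 - \tilde{x}) - \tilde{v}\bigr), \qquad
 \tau \dot{\tilde{x}} = \tilde{v}, \\
&\psi_i = \exp\{-h_i(\tilde{x} - c_i)^2\}, \qquad
 \tilde{f} = \frac{f}{g} = \frac{\sum_i \psi_i w_i \tilde{v}}{\sum_i \psi_i}, \\
&\tau \dot{\tilde{z}} = \alpha_z\bigl(\beta_z(1 - \tilde{y}) - \tilde{z}\bigr), \qquad
 \tau \dot{\tilde{y}} = \tilde{z} + \tilde{f}, \\
&\text{so } y(t) = g\,\tilde{y}(t) \text{ with } \tilde{y} \text{ independent of } g. \\
&\text{With } s = t/\tau:\quad \tau \frac{d}{dt} = \frac{d}{ds}
 \ \Rightarrow\ y(t) = \hat{y}(t/\tau) \text{ with } \hat{y} \text{ independent of } \tau.
\end{aligned}
```

None of the normalized equations contains $g$, and none of the rescaled equations contains $\tau$, which is the content of the two bullets above: changing $g$ stretches the output in space, and changing $\tau$ stretches it in time.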

These arguments can also be extended to the rhythmic DMPs 
for modulations. 
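The weight computation described at the start of this subsection can be sketched as follows. The text only states that the weights follow from Eq. (1) and Eq. (2) with the demonstration as input; the locally weighted regression used here, the gains, the kernel layout, and the helper `synthetic_demo` are assumptions made for illustration, not the authors' procedure.

```python
import math

ALPHA, BETA = 25.0, 6.25  # assumed gains, chosen for critical damping

def basis(x, g, centers, widths):
    """Gaussian kernels of Eq. (6), evaluated at x/g."""
    return [math.exp(-h * (x / g - c) ** 2) for h, c in zip(widths, centers)]

def canonical(g, tau, dt, steps):
    """Euler rollout of Eqs. (3)-(4) from rest; returns the x and v samples."""
    x = v = 0.0
    xs, vs = [], []
    for _ in range(steps):
        xs.append(x)
        vs.append(v)
        dv = ALPHA * (BETA * (g - x) - v) / tau
        dx = v / tau
        v, x = v + dv * dt, x + dx * dt
    return xs, vs

def learn_weights(y_demo, yd_demo, g, tau, dt, centers, widths):
    """Fit one weight per kernel by locally weighted regression of f on v."""
    xs, vs = canonical(g, tau, dt, len(y_demo))
    num = [0.0] * len(centers)
    den = [0.0] * len(centers)
    z = 0.0
    for x, v, y, yd in zip(xs, vs, y_demo, yd_demo):
        f_target = tau * yd - z                        # Eq. (2) solved for f
        z += ALPHA * (BETA * (g - y) - z) / tau * dt   # Eq. (1) driven by the demo
        for i, p in enumerate(basis(x, g, centers, widths)):
            num[i] += p * v * f_target
            den[i] += p * v * v
    return [n / d for n, d in zip(num, den)]

def synthetic_demo(w0, g, tau, dt, steps):
    """Hypothetical demo: the DMP itself with every weight equal to w0 (f = w0 * v)."""
    _, vs = canonical(g, tau, dt, steps)
    y = z = 0.0
    ys, yds = [], []
    for v in vs:
        yd = (z + w0 * v) / tau
        ys.append(y)
        yds.append(yd)
        z += ALPHA * (BETA * (g - y) - z) / tau * dt
        y += yd * dt
    return ys, yds

CENTERS = [0.0, 0.25, 0.5, 0.75, 1.0]
WIDTHS = [50.0] * 5
y_demo, yd_demo = synthetic_demo(3.0, 1.0, 1.0, 0.001, 3000)
weights = learn_weights(y_demo, yd_demo, 1.0, 1.0, 0.001, CENTERS, WIDTHS)
```

The synthetic demonstration is chosen so that the correct answer is known in advance: when every weight equals the same value, the normalized sum in Eq. (5) reduces to that value times v, so the regression should return it for every kernel. A recorded pilot trajectory would replace `synthetic_demo` in practice.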

D. Analysis of DMP Using Contraction Theory 

The basic theorem of contraction analysis [14] is stated as follows.

Theorem (Contraction): Consider the deterministic system

$$\dot{x} = f(x, t) \qquad (13)$$

where $f$ is a smooth nonlinear function. If there exists a uniformly
invertible matrix $\Theta(x, t)$ such that the associated generalized
Jacobian matrix

$$F = \left( \dot{\Theta} + \Theta \frac{\partial f}{\partial x} \right) \Theta^{-1} \qquad (14)$$

is uniformly negative definite, then all system trajectories converge
exponentially to a single trajectory, with convergence rate
$|\lambda_{max}|$, where $\lambda_{max}$ is the largest eigenvalue of
the symmetric part of $F$. The system is said to be contracting.



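For the rhythmic DMPs of Section II-B, the canonical system is

$$\tau \dot{\phi} = 1 \qquad (7)$$
$$\tau \dot{x} = -\mu (x - x_0), \quad \mu > 0 \qquad (8)$$

where $\phi$ corresponds to $x$ in Eq. (3) as a temporal variable.
Similar to the discrete system, the control policy is

$$\tau \dot{z} = \alpha_z (\beta_z (y_m - y) - z) \qquad (9)$$
$$\tau \dot{y} = z + f \qquad (10)$$
$$f(x, v, g) = \frac{\sum_i \psi_i w_i^T v}{\sum_i \psi_i} \qquad (11)$$
$$\psi_i = \exp\{h_i (\cos(\phi - c_i) - 1)\} \qquad (12)$$

where $y_m$ is a basis point for learning and $v = [x \ldots$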
Basically, a nonlinear time-varying dynamic system is called
contracting if initial conditions or temporary disturbances are
forgotten exponentially fast, i.e., if trajectories of the perturbed
system return to their nominal behavior with an exponential convergence
rate. It turns out that relatively simple conditions can be given for
this stability-like property to be verified. Furthermore, this property
is preserved through basic system combinations, such as parallel
combinations, feedback combinations, and series or hierarchies,
yielding simple tools for modular design. For linear time-invariant
systems, contraction is equivalent to strict stability.
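For the linear time-invariant case, the last statement gives a direct numerical check. The sketch below applies it to Eqs. (1)-(2) with f = 0; the state ordering (z, y), the gain values, and the helper names are assumptions made for illustration.

```python
import cmath

def dmp_jacobian(alpha, beta, tau):
    """Jacobian of Eqs. (1)-(2) with f = 0, state ordered as (z, y)."""
    return [[-alpha / tau, -alpha * beta / tau],
            [1.0 / tau, 0.0]]

def eigenvalues_2x2(a):
    """Closed-form eigenvalues of a 2x2 matrix from its trace and determinant."""
    trace = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    root = cmath.sqrt(trace * trace - 4.0 * det)
    return (trace + root) / 2.0, (trace - root) / 2.0

def is_strictly_stable(a):
    """Strict stability of an LTI system: every eigenvalue has negative real part."""
    return all(ev.real < 0.0 for ev in eigenvalues_2x2(a))

def slowest_decay_rate(a):
    """Magnitude of the eigenvalue real part closest to the imaginary axis."""
    return -max(ev.real for ev in eigenvalues_2x2(a))

# Assumed gains: alpha = 25, beta = alpha / 4 gives a repeated real eigenvalue.
nominal = dmp_jacobian(25.0, 6.25, 1.0)
slowed = dmp_jacobian(25.0, 6.25, 2.0)
```

For any positive gains the trace is negative and the determinant positive, so the unforced system is strictly stable; doubling the temporal scaling factor halves the decay rate, which is consistent with its role as a time stretch.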

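Consider a system

$$\frac{d}{dt} \begin{pmatrix} \delta z_1 \\ \delta z_2 \end{pmatrix} = \begin{pmatrix} F_{11} & 0 \\ F_{21} & F_{22} \end{pmatrix} \begin{pmatrix} \delta z_1 \\ \delta z_2 \end{pmatrix} \qquad (15)$$

where $z_1$ and $z_2$ represent the first and the second system of the
DMP and the $\delta z_i$ represent the associated differential
displacements (see [14]). Equation (15) displays a hierarchy of
contracting systems, and furthermore, since $F_{21}$ is bounded by
construction of $f$, the whole system globally exponentially converges
to a single trajectory [14].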

We can also extend the hierarchical contraction property to the
rhythmic DMPs, since the canonical system, shown below, is contracting:

$$\tau \dot{x} = -\mu (x - x_0), \quad \mu > 0 \qquad (16), (17)$$



Although the system will eventually contract to the point g, there will
be a time delay due to the hierarchy between the second and the first
system. We can decrease this delay by increasing the number of weights
in our equation.

Using contraction theory, the stability of the DMPs can be analyzed.
Once the original trajectory is mapped into the DMP, the system behaves
linearly for a given input-output relation, as shown before. Moreover,
the contraction property guarantees convergence to a single trajectory.
From linearity, it is easy to show that learning the trajectories is
not restricted to stationary goal points that have no velocity
component; goal points with velocity components are what the
equilibrium points of virtual trajectories require.

III. Coupling of DMPs Using Contraction Theory 

In this section, we use partial contraction theory [13] to couple DMPs.
The one-way coupling configuration of contraction theory allows a
system to converge to its coupled pair smoothly. The theory for one-way
coupling considers the following two systems:

$$\dot{x}_1 = f(x_1, t) \qquad (18)$$
$$\dot{x}_2 = f(x_2, t) + u(x_1) - u(x_2) \qquad (19)$$

In this formulation, if $f - u$ is contracting, then $x_2 \to x_1$
from any initial condition.
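A scalar example makes the one-way coupling result concrete. The plant f(x) = x - x^3 and the linear coupling u(x) = k x below are assumptions chosen for illustration (they are not the DMP dynamics): f alone is not contracting, but f - u is contracting for k > 1 because its derivative 1 - k - 3x^2 is then negative everywhere.

```python
def f(x, t):
    """Assumed bistable plant with attractors at x = +1 and x = -1."""
    return x - x ** 3

def simulate(k, x1=0.5, x2=-0.5, dt=0.001, duration=10.0):
    """Euler rollout of Eqs. (18)-(19) with u(x) = k * x; returns final (x1, x2)."""
    t = 0.0
    for _ in range(int(round(duration / dt))):
        dx1 = f(x1, t)                        # Eq. (18): leader, unaffected by x2
        dx2 = f(x2, t) + k * x1 - k * x2      # Eq. (19): follower with coupling
        x1, x2 = x1 + dx1 * dt, x2 + dx2 * dt
        t += dt
    return x1, x2

uncoupled = simulate(k=0.0)   # each state settles in its own basin
coupled = simulate(k=3.0)     # the follower is pulled onto the leader
```

Without coupling the two states settle at opposite attractors; with k = 3 the second state converges to the first even though it starts in the other basin, which is the "from any initial condition" part of the statement.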

A typical example of one-way coupling is observer design, where the
first system represents the real plant and the second system represents
the mathematical model of the first. The states of the second system
converge to the states of the first system, resulting in a robust
estimate of the real system states. For our experiments, however, we
use contraction to imitate the transition between two states. Section
IV shows how the end of one trajectory becomes the initial condition of
the second trajectory, with contraction accomplishing the smooth
transition.

In DMPs, we couple the two systems using the following equations:

$$\ddot{y}_1 = g_1 - y_1 - \dot{y}_1 + f(y_1) \qquad (20)$$
$$\ddot{y}_2 = g_2 - y_2 - \dot{y}_2 + u(y_1) - u(y_2) \qquad (21)$$
$$u(x_i) = g_i + f(x_i) \qquad (22)$$
$$f(x, v, g) = \frac{\sum_i \psi_i w_i v}{\sum_i \psi_i} \qquad (23)$$
$$\psi_i = \exp\{-h_i (x/g - c_i)^2\} \qquad (24)$$

Fig. 1. One-way coupling of rhythmic DMPs

A toy example of the equations listed above can be seen in Fig. 1. In
this setting, $y_2$ is the first trajectory primitive, which contracts
to $y_1$, the second trajectory primitive.

One-way coupling has many advantages over earlier methods:

In [42], trajectories are obtained by simply stretching the original
trajectory in its coordinates, and there is a direct relation between
the initial and end points. Also, the derivatives of the trajectory are
discontinuous at the transition regions between primitives. Giese [43]
solves the problem of discontinuities by first taking the derivatives
of the original trajectories, then combining the derivatives, and
finally integrating them again using the initial conditions. However,
this method adversely affects the accuracy of the trajectories. Our
method improves on [42] and [43] by generating more accurate
trajectories independent of the initial points.

In [2], snapshots of the pilot's maneuvers are taken and treated as
noisy measurements of a hidden, true trajectory. In their model, time
indices are used to compare the expert's demonstrations. The joint
likelihood of the demonstrations is maximized through trajectory
learning algorithms. As was done in [?], Locally Weighted Learning is
used for learning the system dynamics close to the trajectories.
Moreover, the desired trajectories are supervised by adding information
specific to each maneuver. With the help of a feasible trajectory, an
optimal controller, and the system dynamics along the maneuver, they
achieved remarkable results on model helicopters. However, finding the
hidden trajectory requires considerable computation, in which the data
are smoothed to emphasize the similarities. In addition, their
algorithm applies only to mimicking demonstrations. In our algorithm,
learning the hidden, true trajectory of maneuvers can simply be done by
comparing the weights of DMPs (see [18]). It is also easier to
manipulate DMPs by changing parameters ($\tau$ and $g$) for new
challenges. Moreover, our method rests on a background of biological
experiments, which makes it adaptable for further research.

In general, we summarize the advantages of using dynamical systems as
control policies as follows:

• It is easy to incorporate perturbations into dynamical systems.

• It is easy to represent the primitives.

• Convergence to the goal position is guaranteed due to the attractor
dynamics of DMP.

• It is easy to modify for different tasks.

• At the transition regions, discontinuities are avoided.

• Partial contraction theory forces the coupling from any initial
condition.

Also, in [18], Schaal's suggested system is driven between stationary
points. However, biological experiments suggest that we are faced with
a "virtual trajectory" composed of equilibrium points that have
velocity components. For this reason, we showed that this property can
be achieved by combining nonconstant points.

IV. Experiments on Helicopter 
Here, we apply the motion primitives on the helicopter. 

A. Experimental Setup 

We used the Quanser Helicopter (see Figure 2) in our experiments. The
helicopter is an under-actuated system with two propellers at the end
of the arm. Two DC motors are mounted below the propellers to create
the forces that drive the propellers. The motors' axes are parallel and
their thrust is normal to the propellers. In contrast with conventional
helicopters, which have six degrees of freedom, we have three degrees
of freedom (DOF): pitch (vertical movement of the propellers), roll
(circular movement around the axis of the propellers) and travel
(movement around the vertical base).




Fig. 2. Transverse momentum distributions. [9]

In the system model [9], the origin of our coordinate system is at the
bearing and slip-ring assembly. The combinations of actuators form the
collective ($T_{col} = T_L + T_R$) and cyclic ($T_{cyc} = T_L - T_R$)
forces, which are used as inputs in our controller. The schematics of
the helicopter are shown in Figures 3 and 4.

Fig. 3. Schematic of the 3DOF helicopter. [9]

Fig. 4. Top view. [9]

Let $J_{xx}$, $J_{yy}$, and $J_{zz}$ denote the moments of inertia of
our system dynamics. For simplicity, we ignore the products of inertia
terms. The equations of motion are as follows (cf. Ishutkina [?]):

$$J_{zz} \ddot{\psi} = (T_L + T_R) L \cos(\theta) \sin(\phi) - (T_L - T_R) l_h \sin(\theta) \sin(\phi) - Drag,$$
$$J_{yy} \ddot{\theta} = -M g l_\theta \sin(\theta + \theta_0) + (T_L + T_R) L \cos(\phi),$$
$$J_{xx} \ddot{\phi} = -m g l_\phi \sin(\phi) + (T_L - T_R) l_h,$$

where

• $M$ is the total mass of the helicopter assembly,

• $m$ is the mass of the rotor assembly,

• $L$ is the length of the main beam from the slip-ring pivot to the
rotor assembly,

• $\psi$, $\theta$, $\phi$ are the travel, pitch and roll angles,
respectively,

• $l_h$ is the distance from the rotor pivot to each of the propellers,

• $Drag = \frac{1}{2} \rho (\dot{\psi} L)^2 (S_0 + S_0' \sin(\phi)) L$,

• $S_0$ and $S_0'$ are the effective drag coefficients times the
reference area, and $\rho$ is the density of air.

It can be seen that the above system is nonlinear in the states but
linear in the control inputs. In practice, we used feedback
linearization with bounded internal dynamics (see Bayraktar [12]) for
the 3DOF helicopter, which tracks trajectories in elevation and travel.

B. Simulation & Experimental Results 

In this section, we first describe our numerical simulation of the
proposed primitive framework. Second, we describe our actual experiment
on the Quanser Helicopter.

1) Trajectory Generation: In the experimental setup, an operator used a
joystick to create aggressive trajectories that pass an obstacle.
However, generating aggressive trajectories with the joystick is a
difficult task even for the operator. Therefore, we designed an
augmented control for the joystick to enhance the performance of the
helicopter. In detail, the "up" and "down" movements of the joystick
increase or decrease the $T_{col}$ applied to the actuators. For the
"right" and "left" movements of the joystick, we chose to control the
roll angle using PD control.
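The augmented joystick control described above can be sketched as follows. Only the structure comes from the text (vertical stick motion adjusts the collective force, horizontal stick motion sets a roll reference tracked by a PD law, and the collective and cyclic forces are the sum and difference of the two motor thrusts); the gains, the step size, the roll limit, the sign conventions, and all function names are hypothetical.

```python
def roll_pd(roll_ref, roll, roll_rate, kp=2.0, kd=0.5):
    """PD law returning a cyclic force command; gains are placeholders."""
    return kp * (roll_ref - roll) - kd * roll_rate

def joystick_to_inputs(stick_x, stick_y, t_col, roll, roll_rate,
                       max_roll=0.5, col_step=0.01):
    """Map stick deflections in [-1, 1] to (collective, cyclic) commands."""
    t_col = t_col + col_step * stick_y            # "up"/"down" changes collective
    t_cyc = roll_pd(max_roll * stick_x, roll, roll_rate)  # "left"/"right" sets roll
    return t_col, t_cyc

def to_motor_thrusts(t_col, t_cyc):
    """Invert T_col = T_L + T_R and T_cyc = T_L - T_R."""
    return 0.5 * (t_col + t_cyc), 0.5 * (t_col - t_cyc)

# One control step with the stick pushed fully right and up (illustrative values).
t_col, t_cyc = joystick_to_inputs(1.0, 1.0, t_col=1.0, roll=0.0, roll_rate=0.0)
t_left, t_right = to_motor_thrusts(t_col, t_cyc)
```

Letting the operator command a roll angle rather than the cyclic force directly leaves the fast roll dynamics to the inner loop, which is one plausible reading of why the augmented control makes aggressive trajectories easier to fly.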

In the original maneuver, the obstacle's distance and highest point are
at the coordinates where the $\psi$ and $\theta$ angles are 220° and
60°, respectively, and the helicopter stops at the coordinates where
$\psi$ = 28°, $\theta$ = 317° and $\phi$ = −17° (see Figure 5).



[Plot panels: "y & y demo", "yd & yd demo", "ydd & ydd demo",
"parameter x", "parameter v", "parameter z"; horizontal axes: time
(sec).]

Fig. 5. Original maneuver achieved by an operator

Fig. 7. Trajectory generation for the second primitive - pitch



2) Trajectory Learning: From several demonstrations, it is observed
that our operator follows two distinct patterns to carry out the
maneuver. Accordingly, these two patterns suggest an equilibrium point
at the top of the obstacle. Therefore, to fly over different obstacles,
the acquired primitive is segmented into two primitives at the highest
pitch angle. Fig. 6 and Fig. 7 show the results of the DMP algorithm
for the pitch angle. The top left graphs are the results for the pitch
angles, where green lines represent the operator input for the
trajectories and blue lines represent the fits that the DMP computes
for different start and end points. Hence, the desired trajectories in
these graphs do not lie on top of the trajectories generated by the
operator. The other graphs show the time evolution of the DMP
parameters.

Fig. 6. Trajectory generation for the first primitive - pitch

3) Synchronization of primitives: The two primitives created in the
previous sections are defined as trajectories between certain start and
end points. However, the end point of the first trajectory does not
necessarily match the starting point of the second trajectory. We use
partial contraction theory [13] to force the first trajectory to
converge to the second one. However, since we want to use the
contraction as a transition between two trajectories, the coupling is
enabled only towards the end of the first primitive. Figure 8 shows how
the two trajectories evolve in time. In the first primitive, the goal
positions of the $\psi$ and $\theta$ angles are changed to 150° and
50°, respectively, where the original angles are $\psi$ = 220° and
$\theta$ = 60°. In the second primitive, the goal position of the
$\psi$ angle is changed from 317° to 300°.




Fig. 8. Time evolution of primitive-1 and primitive-2 merged. 

4) Experiments on the Helicopter: The tracking performance of the
helicopter is shown in Figure 9. The helicopter followed the desired
($\psi$ and $\theta$) angles almost perfectly. However, the trajectory
of the roll angle differs somewhat from the desired one, since we
control two parameters ($\psi$ and $\theta$) and the goal positions of
the DMPs are different. We should nevertheless highlight that the two
roll trajectories follow the same pattern. In the figure, the last part
of the roll trajectory exhibits an oscillation, which can be prevented
by roll control, since the other parameters are almost constant. The
tracking performance can be further improved by applying discrete
nonlinear observers to obtain better velocity and acceleration values.
Figure 10 shows snapshots of the maneuver.




Fig. 9. Tracking performance of the helicopter. 
Fig. 10. Snapshots of the obstacle avoidance maneuver.

V. Extensions of DMP 

A. Dynamical System with First-Order Filters 

The DMP algorithm can be improved by replacing the first system with
the equations shown below:

$$\tau \dot{y} + \alpha_1 y = x \qquad (25)$$
$$\tau \dot{x} + \alpha_2 x = g + f \qquad (26)$$

which is equivalent to

$$\tau^2 \ddot{y} + \tau (\alpha_1 + \alpha_2) \dot{y} + \alpha_1 \alpha_2 y = g + f \qquad (27)$$

By introducing two first-order filters, we guarantee the stability of
the system against time-varying parameters like $\tau(t)$ or $g(t)$.
Since the system is linear without the f-function (Eq. (27)), we
achieve the learning and modulation properties of the DMP using $f$ in
either Eq. (25) or Eq. (26). For further applications, we will use this
model to generate primitives for time-varying goal points.
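The two cascaded first-order filters can be simulated directly. The sketch below assumes that the first filter is driven by x, which is the reading under which the second-order form follows from the two first-order equations; the gains, the Euler scheme, and the particular time-varying scaling factor are illustrative choices, and f is set to zero.

```python
import math

def simulate_filters(goal, tau, alpha1=1.0, alpha2=1.0, dt=0.001, duration=40.0):
    """Euler rollout of Eqs. (25)-(26) with f = 0; goal and tau are functions of time."""
    x = y = 0.0
    t = 0.0
    for _ in range(int(round(duration / dt))):
        dx = (goal(t) - alpha2 * x) / tau(t)   # Eq. (26) with f = 0
        dy = (x - alpha1 * y) / tau(t)         # Eq. (25), assumed driven by x
        x, y = x + dx * dt, y + dy * dt
        t += dt
    return y

# Constant goal with a constant and with a time-varying (always positive) tau.
y_const = simulate_filters(lambda t: 2.0, lambda t: 1.0)
y_varying = simulate_filters(lambda t: 2.0, lambda t: 1.0 + 0.5 * math.sin(t))
```

In this sketch the output settles at g / (alpha1 * alpha2), so the goal itself is reached only when the product of the two gains is one. Each filter stays stable for any positive scaling factor, varying or not, which is the property the text attributes to this structure.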

B. Generating New Primitives 

Experiments on the frog's spinal cord [?], [26], [47] suggest that
movement primitives can be generated from linear combinations of
vectorial force fields, which lead the limb of a frog to the virtual
equilibrium points. In [47], it is also pointed out that the vectorial
summation of two force fields with different equilibrium points
generates a new force field whose equilibrium point is at an
intermediate location between the original equilibrium points. From
this perspective, we will use two methods to generate new primitives.

1) Two-way Synchronization of DMPs: Consider a system

$$\dot{y}_1 = f(y_1, t) + K (u(y_2) - u(y_1)) \qquad (28)$$
$$\dot{y}_2 = f(y_2, t) + K (u(y_1) - u(y_2)), \quad K > 0 \qquad (29)$$

where $y_1$ and $y_2$ represent the first and the second primitive,
respectively. From partial contraction theory, $y_1$ and $y_2$ converge
together exponentially if $f - 2Ku$ is contracting. Since DMPs are
already contracting, we achieve synchronization using contracting
inputs. In Fig. 11 (top), the new primitive is a linear combination of
sine and cosine primitives. Also in the same figure, the coupling
forces account for the oscillations before synchronization happens.
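The synchronization condition can be illustrated on a scalar example. The dynamics f(y, t) = sin(y) + cos(t) and the coupling u(y) = y below are assumptions chosen for illustration rather than the DMP dynamics: the derivative of f alone changes sign, but the derivative of f - 2Ku is cos(y) - 2K, which is negative everywhere for K = 1.

```python
import math

def f(y, t):
    """Assumed dynamics; df/dy = cos(y) changes sign, so f alone is not contracting."""
    return math.sin(y) + math.cos(t)

def synchronize(gain, y1=2.0, y2=-1.0, dt=0.001, duration=20.0):
    """Euler rollout of Eqs. (28)-(29) with u(y) = y; returns final (y1, y2)."""
    t = 0.0
    for _ in range(int(round(duration / dt))):
        dy1 = f(y1, t) + gain * (y2 - y1)   # Eq. (28)
        dy2 = f(y2, t) + gain * (y1 - y2)   # Eq. (29)
        y1, y2 = y1 + dy1 * dt, y2 + dy2 * dt
        t += dt
    return y1, y2

y1_final, y2_final = synchronize(gain=1.0)
gap = abs(y1_final - y2_final)
```

Because each state is pulled toward the other, the difference between them is damped by twice the gain, which is where the factor 2K in the condition comes from.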




Fig. 11. Top: Synchronization of sine and cosine primitives. Bottom:
New primitive generated by the linear combination of weights from sine
and cosine primitives

2) Combination of Primitives using Weights: In DMPs, as shown before,
the system behaves linearly and superposition applies. Therefore, in
the f-function, a linear combination of the weights from different
primitives produces a linear combination of primitives. For rhythmic
DMPs, as an example, we combine the weights of the sine and cosine
primitives ($w_{new} = 0.5 w_{sine} + 0.5 w_{cosine}$) to generate a
new primitive (see Fig. 11 (bottom)). For a regular DMP, however, we
cannot achieve the desired trajectories despite linearity, because the
input point $g$ is not compatible with the weights changing with
respect to the couplings. For this reason, we will modify the equations
in our later research.
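The superposition argument for rhythmic primitives can be checked numerically. The sketch below is a simplified stand-in for the rhythmic DMP: it uses the periodic kernels of Eq. (12), a constant amplitude in place of the amplitude state, and hypothetical gains, kernel layout, and weight vectors (samples of a sine and a cosine), none of which are taken from the experiments.

```python
import math

def rhythmic_forcing(phi, weights, centers, widths, amplitude=1.0):
    """Normalized periodic kernels (Eq. 12) weighted by w_i, times a constant amplitude."""
    psi = [math.exp(h * (math.cos(phi - c) - 1.0)) for h, c in zip(widths, centers)]
    return amplitude * sum(p * w for p, w in zip(psi, weights)) / sum(psi)

def rhythmic_rollout(weights, centers, widths, y_m=0.0, tau=1.0,
                     alpha=25.0, beta=6.25, dt=0.001, duration=10.0):
    """Euler rollout of Eqs. (9)-(10) driven by a phase that advances at rate 1/tau."""
    phi = y = z = 0.0
    ys = []
    for _ in range(int(round(duration / dt))):
        f = rhythmic_forcing(phi, weights, centers, widths)
        dz = alpha * (beta * (y_m - y) - z) / tau
        dy = (z + f) / tau
        z, y = z + dz * dt, y + dy * dt
        phi += dt / tau
        ys.append(y)
    return ys

N = 8
CENTERS = [2.0 * math.pi * i / N for i in range(N)]
WIDTHS = [2.5] * N
w_sine = [math.sin(c) for c in CENTERS]      # hypothetical "sine" primitive
w_cosine = [math.cos(c) for c in CENTERS]    # hypothetical "cosine" primitive
w_new = [0.5 * a + 0.5 * b for a, b in zip(w_sine, w_cosine)]

y_sine = rhythmic_rollout(w_sine, CENTERS, WIDTHS)
y_cosine = rhythmic_rollout(w_cosine, CENTERS, WIDTHS)
y_new = rhythmic_rollout(w_new, CENTERS, WIDTHS)
```

In this sketch the blended rollout coincides, up to rounding, with the average of the two individual rollouts, because the forcing term is linear in the weights and the remaining dynamics are linear in the state and the forcing.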

VI. Conclusion 

In this paper, we use a novel approach, inspired by biological
experiments and humanoid robotics, which uses control primitives to
imitate data taken from a human-performed obstacle avoidance maneuver.
In our model, the DMP computes the trajectory dynamics so that we can
generate complex primitive trajectories for different given start and
end points, while one-way coupling ensures smooth transitions between
primitives at the equilibrium points. We demonstrate our algorithm with
an experiment in which we generate a complex, aggressive maneuver that
our helicopter could follow within a given error bound at a desired
speed. Future research will be conducted on different combinations of
primitives using partial contraction theory. We expect these techniques
to be particularly useful when the system dynamic models are very
coarse, as, e.g., in the case of flapping wing systems and new
bio-inspired underwater vehicles.
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Abstract — We introduce a simple framework for learning 
aggressive maneuvers in flight control of UAVs. Having in- 
spired from biological environment, dynamic movement prim- 
itives are analyzed and extended using nonlinear contraction 
theory. Accordingly, primitives of an observed movement are 
stably combined and concatenated. We demonstrate our results 
experimentally on the Quanser Helicopter, in which we first 
imitate aggressive maneuvers and then use them as primitives 
to achieve new maneuvers that can fly over an obstacle. 

I. Introduction 

The role of UAVs (Unmanned Aerial Vehicles) has gained 
significant importance in the last decades. They have many 
advantages (agility, low surface area, ability to work in con- 
strained or dangerous places) over their conventional prece- 
dents. In addition, current UAVs are more biologically-inspired 
in terms of shape and performance because of the improve- 
ments in electronics and propulsion. Unfortunately, we are still 
far away from using their capacity at the fullest. This is is 
mostly related with the weakness of current control algorithms 
against high-dimensional and nonlinear environments. In this 
sense, generating aggressive maneuvers is interesting and hard 
to accomplish. 

In this paper, our approach to solve this issue is designed 
in view of the experiments on frogs and monkeys which 
suggest that we are faced with an inverse-kinematics algorithm 
that adapts to the environment and changes in a sequence of 
target points irrespective of the initial conditions. In theory, we 
analyzed dynamic movement primitives (DMPs)| 18| and com- 
bined them using contraction theory. In experiments, obstacle 
avoidance DM? of a human-piloted flight data is segmented 
into parts and combined at different initial points to achieve 
maneuvers against different obstacles on different locations. 
Background of our work is briefly detailed below. 

A. Background 

1) Imitation Learning: "By three methods we may learn 
wisdom: first, by reflection, which is noblest; second, by 
imitation, which is easiest; and third, by experience, which 
is the most bitter." (Confucius) Imitation takes place when 
an agent learns a behavior by observing the execution of 
that behavior from a teacher fTT^. Imitation is not inherent 
to humans. It is also observed in animals. For example, 
experiments show that kittens exposed to adult cats manipulate 
levers to retrieve food much faster than the control group L28J . 



There has been a number of applications on imitation 
learning in the field of robotics. Studies on locomotion f5|, 
L6J, 134], humanoid robots Q, ||3,|l29J, |27J, and human-robot 
interactions ll33l . ll20l have used imitation learning or move- 
ment primitives. The emphasis on these studies is on primitive 
derivation and movement classification [30J; combinations of 
the primitives JH], |[l6l, |l22l, |l23]. El, ED and primitive 
models lH?), Ull, 103, 1021 in order to extract behaviors. 

2) Aggressive Maneuvers: Aggressive control of au- 
tonomous helicopters represents a challenging problem for 
engineers. The challenge owes itself to the highly nonlinear 
and unstable nature of the dynamics along with the nonlinear 
relations for actuator saturation. Nevertheless, we can find 
successful unmanned helicopter examples H, il35l . O], (Ol, 
|f37l, f3S\, (39), go), gD in the literature. However, model 
helicopters controlled by humans can achieve considerably 
more complex and aggressive maneuvers compared to that can 
be done autonomously with the state of the art. In f36l|, it is 
observed that after several repetitions of the same maneuver, 
performed by a human, generated trajectories are similar and 
the control inputs are well-structured and repetitive. Hence, it 
is intuitive to focus on understanding human's maneuvers to 
find proper algorithms for unmanned control. 

3) Biological Motivation: In their experiment with deaf- 
ferented and intact monkeys, Bizzi [24] found that a certain 
movement can be executed regardless of initial conditions, 
emphasizing the importance of feedback control. In particular, 
they have shown that the control variable is the equilibrium 
state of the agonist and antagonist muscles. Same experimental 
setup is again used to characterize the trajectory of the motion 
in (251 • Their results additionally suggest that movement called 
"virtual trajectory" is composed of more than one equilibrium 
point and central nervous system uses the stability of the 
lower level of the motor system to simplify the generation 
of movement primitives||25|. 

Bizzi ifSTl and Mussa-Ivaldi |26|'s experiments on frogs 
provide us with further clues in understanding movement 
primitives. They microstimulated spinal cord and measured the 
forces at the ankle. Having repeated this process with ankle 
replaced at nine to 16 locations, they observed that collection 
of measured forces always converges to a single equilibrium 
point. In their model, inverse kinematics plays a crucial role 
in achieving the endpoint trajectory (see Mussa-Ivaldi |3^). 



II. Analysis of DMP 

This section outlines the analysis of the DMP algorithm 
using contraction theory. 

A. DMP Algorithm 

DMP is a trajectory generation algorithm which interpolates 
between the start and end points of a path based on learning. 
The system can be represented by 



Tz = a,XPz{g -y)- z) 
Ty^z + f, 



(1) 
(2) 



where y, y and y characterize the desired trajectory, az and 
Pz are time constants, t is a temporal scaling factor, g is the 
desired end point. In addition, the canonical system is given 
by 



TV = a{l3z{g -x)-v) 



TX 



(3) 

(4) 



In general, assuming that the /-function is zero, system will 
converge to g exponentially. The goal of the DMP algorithm 
is to modify this exponential path so that the /-function 
makes the system non-Unear and allows us to generate desired 
trajectories between the origin and the g point. 

The /-function is a normalized linear combination of Gaus- 
sians which helps to approximate the final trajectory, i.e. it has 
the general form 



f{x,v,g) 



where 



= exp{-hi{x/g - Cif}. 



(5) 



(6) 



B. Rhythmic DMPs 

The DMP algorithm can also be extended to the rhythmic 
movements |46| by changing the canonical system with the 
following: 



C. Learning of primitives using DMPs 

The learning aspect of the algorithm comes into play with the
computation of the weights ($w_i$) of the Gaussians. The weights
are derived from Eq. (1) and Eq. (2) using the training trajectory
$y_{demo}$ and $\dot{y}_{demo}$ as variables. Once the parameters of the
$f$-function are learned, the DMP can simply be used to generate
the original trajectory. As detailed below, spatial and temporal
shifts are achieved by adjusting $g$ and $\tau$, respectively.

• Spatial adjustments: The first system [Eq. (1), Eq. (2)]
can be seen as a linear system, because the variable $v$
in the $f$-function is only multiplied by a time-varying
constant. Hence, by superposition, the output ($y$) is
simply scaled by $g$.

• Temporal adjustments: The second system [Eq. (3),
Eq. (4)] is simply linear. In addition, the $f$-function is linear
because the multiplier is a time-varying constant that is
temporally scaled by $\tau$. Thus, from linearity, temporal
adjustments of the whole system are carried
out by just changing the variable $\tau$.

These arguments can also be extended to the rhythmic DMPs
for modulations.
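
The learning step and the spatial scaling argument can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the fitting rule is a locally weighted regression, which the text above does not prescribe, and the function names, gains (`ALPHA`, `BETA`), shared kernel width `h` and Euler discretization are all hypothetical choices.

```python
import math

ALPHA, BETA = 25.0, 6.25  # assumed gains; BETA = ALPHA / 4 gives critical damping

def canonical(g, tau, dt, n):
    """Euler rollout of the canonical system, Eqs. (3)-(4)."""
    x = v = 0.0
    xs, vs = [], []
    for _ in range(n):
        xs.append(x)
        vs.append(v)
        v, x = v + ALPHA * (BETA * (g - x) - v) / tau * dt, x + v / tau * dt
    return xs, vs

def kernels(x, g, centers, h):
    """Gaussian kernels of Eq. (6) with a shared width h."""
    return [math.exp(-h * (x / g - c) ** 2) for c in centers]

def learn_weights(y_demo, g, tau, dt, centers, h):
    """Fit the weights w_i from a demonstration sampled every dt seconds."""
    n = len(y_demo)
    xs, vs = canonical(g, tau, dt, n)
    num = [0.0] * len(centers)
    den = [0.0] * len(centers)
    z = 0.0
    for k in range(n - 1):
        yd_dot = (y_demo[k + 1] - y_demo[k]) / dt   # finite-difference velocity
        f_target = tau * yd_dot - z                  # Eq. (2) solved for f
        psi = kernels(xs[k], g, centers, h)
        for i, p in enumerate(psi):
            num[i] += p * vs[k] * f_target
            den[i] += p * vs[k] ** 2
        # Advance z along the demonstration with Eq. (1)
        z += ALPHA * (BETA * (g - y_demo[k]) - z) / tau * dt
    return [a / b if b > 0.0 else 0.0 for a, b in zip(num, den)]

def reproduce(weights, g, tau, dt, n, centers, h):
    """Regenerate a trajectory from a set of weights, Eqs. (1)-(6)."""
    xs, vs = canonical(g, tau, dt, n)
    z = y = 0.0
    ys = []
    for k in range(n):
        ys.append(y)
        psi = kernels(xs[k], g, centers, h)
        f = sum(p * w for p, w in zip(psi, weights)) * vs[k] / sum(psi)
        z, y = z + ALPHA * (BETA * (g - y) - z) / tau * dt, y + (z + f) / tau * dt
    return ys
```

With weights `w` learned for a goal `g`, calling `reproduce(w, 2.0 * g, ...)` returns the same shape scaled by two, which is the spatial adjustment described above; changing `tau` stretches or compresses the trajectory in time.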

D. Analysis of DMP Using Contraction Theory 

The basic theorem of contraction analysis [14] is stated as follows.

Theorem (Contraction): Consider the deterministic system

$$\dot{x} = f(x, t) \tag{13}$$

where $f$ is a smooth nonlinear function. If there exists a
uniformly invertible matrix $\Theta(x, t)$ such that the associated
generalized Jacobian matrix

$$F = \left(\dot{\Theta} + \Theta \frac{\partial f}{\partial x}\right)\Theta^{-1} \tag{14}$$

is uniformly negative definite, then all system trajectories
converge exponentially to a single trajectory, with convergence
rate $|\lambda_{max}|$, where $\lambda_{max}$ is the largest eigenvalue of the
symmetric part of $F$. The system is said to be contracting.



For the rhythmic DMPs of Section II-B, the canonical system is

$$\tau \dot{\phi} = 1 \tag{7}$$
$$\tau \dot{x} = -\mu (x - x_0), \quad \mu > 0 \tag{8}$$

where $\phi$ corresponds to $x$ in Eq. (3) as a temporal variable.
Similar to the discrete system, the control policy is

$$\tau \dot{z} = \alpha_z(\beta_z(y_m - y) - z) \tag{9}$$
$$\tau \dot{y} = z + f \tag{10}$$
$$f = \frac{\sum_{i} \psi_i w_i^T \tilde{v}}{\sum_{i} \psi_i} \tag{11}$$
$$\psi_i = \exp\{h_i(\cos(\phi - c_i) - 1)\} \tag{12}$$

where $y_m$ is a basis point for learning and $\tilde{v} = [x \cos\phi,\; x \sin\phi]^T$.

Basically, a nonlinear time-varying dynamic system is called
contracting if initial conditions or temporary disturbances are 
forgotten exponentially fast, i.e., if trajectories of the perturbed 
system return to their nominal behavior with an exponential 
convergence rate. It turns out that relatively simple conditions 
can be given for this stability-like property to be verified. 
Furthermore, this property is preserved through basic system
combinations, such as parallel combinations, feedback com- 
binations, and series or hierarchies, yielding simple tools for 
modular design. For linear time-invariant systems, contraction 
is equivalent to strict stability. 
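
As a small worked illustration of the last statement (an example added here under the assumptions that the forcing term $f$ is set to zero and that $\tau$, $\alpha_z$ and $\beta_z$ are positive constants), the transformation system of Eqs. (1) and (2) is linear time-invariant in the state $(z, y)$:

```latex
\frac{d}{dt}\begin{pmatrix} z \\ y \end{pmatrix}
= \underbrace{\frac{1}{\tau}\begin{pmatrix} -\alpha_z & -\alpha_z \beta_z \\ 1 & 0 \end{pmatrix}}_{A}
\begin{pmatrix} z \\ y \end{pmatrix}
+ \frac{1}{\tau}\begin{pmatrix} \alpha_z \beta_z g \\ 0 \end{pmatrix},
\qquad
\det(\lambda I - A) = \lambda^2 + \frac{\alpha_z}{\tau}\lambda + \frac{\alpha_z \beta_z}{\tau^2}.
```

All coefficients of the characteristic polynomial are positive, so both eigenvalues have negative real parts; the system is strictly stable and hence contracting in a suitable constant metric. For the choice $\beta_z = \alpha_z / 4$ (an assumption, not a value stated in this paper), the polynomial becomes $(\lambda + \alpha_z / (2\tau))^2$, i.e. a repeated eigenvalue at $-\alpha_z / (2\tau)$.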

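Consider a system

$$\frac{d}{dt}\begin{pmatrix}\delta z_1 \\ \delta z_2\end{pmatrix} = \begin{pmatrix}F_{11} & 0 \\ F_{21} & F_{22}\end{pmatrix}\begin{pmatrix}\delta z_1 \\ \delta z_2\end{pmatrix} \tag{15}$$

where $z_1$ and $z_2$ represent the first and the second system
of the DMP and the $\delta z_i$ represent the associated differential
displacements (see [14]). Equation (15) displays a hierarchy of
contracting systems, and furthermore, since $F_{21}$ is bounded
by construction of $f$, the whole system globally exponentially
converges to a single trajectory [14].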

We can also extend the hierarchical contraction property
to the rhythmic DMPs, since the canonical system, which is
shown below, is contracting.

$$\tau \dot{x} = -\mu (x - x_0), \quad \mu > 0 \tag{16, 17}$$



Although the system will eventually contract to the point $g$,
there will be a time delay due to the hierarchy between the second
and the first system. We can decrease this delay by increasing
the number of weights in our equation.

Using contraction theory, the stability of the DMPs can be
analyzed. Once the original trajectory is mapped into the DMP,
the system behaves linearly for a given input-output relation, as
shown before. Moreover, the contraction property guarantees
convergence to a single trajectory. From linearity, it is easy
to show that learning the trajectories is not constrained to
stationary goal points, which lack the velocity components
required for equilibrium points in virtual trajectories.

III. Coupling of DMPs Using Contraction Theory 

In this section, we use partial contraction theory [13]
to couple DMPs. The one-way coupling configuration of contraction
theory allows a system to converge to its coupled
pair smoothly. The theory for one-way coupling considers the
following two systems:

$$\dot{x}_1 = f(x_1, t) \tag{18}$$
$$\dot{x}_2 = f(x_2, t) + u(x_1) - u(x_2) \tag{19}$$

In this formulation, if $f - u$ is contracting, then $x_2 \to x_1$
from any initial condition.
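
The one-way coupling result can be checked numerically on a scalar toy system. In the sketch below, the choices $f(x, t) = -x + \sin t$ and $u(x) = kx$, the gain `k`, and the function name `simulate_one_way` are assumptions made for illustration only; they are not the DMP dynamics used later. For these choices, $f - u$ has Jacobian $-(1 + k) < 0$ and is therefore contracting.

```python
import math

def simulate_one_way(x1_0, x2_0, k=2.0, dt=0.001, t_end=5.0):
    """Euler simulation of Eqs. (18)-(19) for an assumed scalar example.

    f(x, t) = -x + sin(t) and u(x) = k * x are illustrative choices.
    Returns the final states (x1, x2).
    """
    def f(x, t):
        return -x + math.sin(t)

    def u(x):
        return k * x

    x1, x2 = x1_0, x2_0
    for step in range(int(t_end / dt)):
        t = step * dt
        dx1 = f(x1, t)                    # Eq. (18): driving system
        dx2 = f(x2, t) + u(x1) - u(x2)    # Eq. (19): driven system
        x1, x2 = x1 + dx1 * dt, x2 + dx2 * dt
    return x1, x2
```

Starting the two systems far apart, for example `simulate_one_way(0.0, 5.0)`, the difference $x_2 - x_1$ decays at the rate $1 + k$, while $x_1$ itself is unaffected by the coupling.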

A typical example of one-way coupling is observer
design, where the first system represents the real plant and the
second system represents the mathematical model of the first
system. The states of the second system converge to the
states of the first system, resulting in a robust estimate
of the real system states. For our experiments, however, we
use contraction to realize the transition between two
states. Section IV shows how the end of one
trajectory becomes the initial condition of the second trajectory
and how contraction accomplishes the smooth transition.

In DMPs, we couple the two systems using the following 
equations: 

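$$\ddot{y}_1 = g_1 - y_1 - \dot{y}_1 + f(y_1) \tag{20}$$
$$\ddot{y}_2 = g_2 - y_2 - \dot{y}_2 + u(y_1) - u(y_2) \tag{21}$$
$$u(x_i) = g_i + f(x_i) \tag{22}$$
$$f(x, v, g) = \frac{\sum_{i=1}^{N} \psi_i w_i v}{\sum_{i=1}^{N} \psi_i} \tag{23}$$
$$\psi_i = \exp\{-h_i (x/g - c_i)^2\} \tag{24}$$

Fig. 1. One-way coupling of rhythmic DMPs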



A toy example of the equations listed above can be seen in
Fig. 1. In this setting, $y_2$ is the first trajectory primitive, which
contracts to $y_1$, the second trajectory primitive.

One-way coupling has several advantages over earlier methods.

In [42], trajectories are achieved by simply stretching the
original trajectory in its coordinates, and there is a direct
relation between initial and end points. Also, there are
discontinuities in the derivatives of the trajectory at the
transition regions between primitives. Giese [43] solves the
problem of discontinuities by first taking the derivatives of the
original trajectories, then combining the derivatives, and finally
integrating them again using initial conditions. However, this
method adversely affects the accuracy of the trajectories.
Hence, our method improves on [42] and [43] by generating
more accurate trajectories independent of initial points.

In [2], snapshots of the pilot's maneuvers are taken and
evaluated as noisy measurements of a hidden, true trajectory.
In their model, time indexes are used for the comparison of
the expert's demonstrations. Maximization of the joint likelihood
of the demonstrations is achieved through trajectory learning
algorithms. As was done in HI, Locally Weighted Learning 
is used for learning system dynamics close to trajectories. 
Moreover, desired trajectories are supervised by adding infor- 
mation specific to each maneuver. As a result, with the help 
of feasible trajectory, optimal controller and system dynamics 
along the maneuver, they achieved remarkable results on 
model helicopters. However, finding the hidden trajectory requires
considerable computation, in which the data are smoothed
to emphasize the similarities. In addition, their algorithm
applies only to mimicking demonstrations. In our algorithm,
learning the hidden and true trajectory of maneuvers can
simply be done by comparing the weights of DMPs (see [18]).
It is also easier to manipulate DMPs by changing parameters
($\tau$ and $g$) for new challenges. Moreover, our method is
grounded in biological experiments, which makes
it adaptable for further research.

In general, we summarize the advantages of using dynamical
systems as control policies as follows:

• It is easy to incorporate perturbations to dynamical sys- 
tems. 

• It is easy to represent the primitives. 

• Convergence to the goal position is guaranteed due to the 
attractor dynamics of DMP. 

• It is easy to modify for different tasks. 

• At the transition regions, discontinuities are avoided. 

• Partial contraction theory forces the coupling from any 
initial condition. 

Also, in [18], Schaal's suggested system is driven between
stationary points. However, biological experiments suggest
that we are faced with a "virtual trajectory" composed of
equilibrium points that have velocity components. For this
reason, we showed that this property can be achieved by
combining nonconstant points.

IV. Experiments on Helicopter 
Here, we apply the motion primitives on the helicopter. 

A. Experimental Setup 

We used the Quanser Helicopter (see Figure 2) in our experiments.
The helicopter is an under-actuated system with
two propellers at the end of the arm. Two DC motors are
mounted below the propellers to create the forces that
drive the propellers. The motors' axes are parallel and their
thrust is perpendicular to the propellers. We have three degrees of
freedom (DOF): pitch (vertical movement of the propellers),
roll (circular movement around the axis of the propellers) and
travel (movement around the vertical base), in contrast with
conventional helicopters, which have six degrees of freedom.




Fig. 2. Transverse momentum distributions. [9]

In the system model [9], the origin of our coordinate system is
at the bearing and slip-ring assembly. The combinations of
actuators form the collective ($T_{col} = T_L + T_R$) and cyclic
($T_{cyc} = T_L - T_R$) forces, which are used as inputs in our
controller. The schematics of the helicopter are shown in Figures
3 and 4.

Fig. 3. Schematic of the 3DOF helicopter [9].

Fig. 4. Top view [9].

Let $J_{xx}$, $J_{yy}$, and $J_{zz}$ denote the moments of inertia of
our system dynamics. For simplicity, we ignore the products
of inertia terms. The equations of motion are as follows (cf.
Ishutkina [9]):

$$J_{zz}\ddot{\psi} = (T_L + T_R) L \cos(\theta)\sin(\phi) - (T_L - T_R) l_h \sin(\theta)\sin(\phi) - \mathrm{Drag},$$
$$J_{yy}\ddot{\theta} = -M g l_\theta \sin(\theta + \theta_0) + (T_L + T_R) L \cos(\phi),$$
$$J_{xx}\ddot{\phi} = -m g l_\phi \sin(\phi) + (T_L - T_R) l_h,$$

where

• $M$ is the total mass of the helicopter assembly,

• $m$ is the mass of the rotor assembly,

• $L$ is the length of the main beam from the slip-ring pivot
to the rotor assembly,

• $\psi$, $\theta$, $\phi$ are the travel, pitch and roll angles, respectively,

• $l_h$ is the distance from the rotor pivot to each of the
propellers,

• $\mathrm{Drag} = \frac{1}{2}\rho(\dot{\psi}L)^2(S_0 + S_0' \sin(\phi))L$,

• $S_0$ and $S_0'$ are the effective drag coefficients times the
reference area and $\rho$ is the density of air.

It can be seen that the above system is nonlinear in the 
states, but linear in terms of control inputs. In practice, we 
used feedback linearization with bounded internal dynamics 
(see Bayraktar [12]) for a 3DOF helicopter, which tracks 
trajectories in elevation and travel. 
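
To make the structure of the model concrete, the following Python sketch evaluates the three angular accelerations for given thrusts. It assumes the drag term has the form $\frac{1}{2}\rho(\dot{\psi}L)^2(S_0 + S_0'\sin\phi)L$; the function name, the dictionary keys and any numeric values passed in are placeholders, not identified parameters of the Quanser Helicopter.

```python
import math

def helicopter_accelerations(state, t_left, t_right, p):
    """Angular accelerations (travel, pitch, roll) from the equations of motion.

    state = (theta, phi, psi_dot): pitch angle, roll angle and travel rate.
    p is a dict of physical parameters; all values are placeholders.
    """
    theta, phi, psi_dot = state
    t_col = t_left + t_right   # collective thrust
    t_cyc = t_left - t_right   # cyclic thrust
    # Assumed drag form: 0.5 * rho * (psi_dot * L)^2 * (S0 + S0' sin(phi)) * L
    drag = (0.5 * p["rho"] * (psi_dot * p["L"]) ** 2
            * (p["S0"] + p["S0p"] * math.sin(phi)) * p["L"])
    psi_dd = (t_col * p["L"] * math.cos(theta) * math.sin(phi)
              - t_cyc * p["lh"] * math.sin(theta) * math.sin(phi)
              - drag) / p["Jzz"]
    theta_dd = (-p["M"] * p["g"] * p["l_theta"] * math.sin(theta + p["theta0"])
                + t_col * p["L"] * math.cos(phi)) / p["Jyy"]
    phi_dd = (-p["m"] * p["g"] * p["l_phi"] * math.sin(phi)
              + t_cyc * p["lh"]) / p["Jxx"]
    return psi_dd, theta_dd, phi_dd
```

For a fixed state, the returned accelerations are affine in `t_left` and `t_right`, which is the property that feedback linearization exploits.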

B. Simulation & Experimental Results 

In this section, we first describe our numerical simulation
of the proposed primitive framework. We then describe our
actual experiment on the Quanser Helicopter.

1) Trajectory Generation: In the experimental setup, we used
an operator with a joystick to create aggressive trajectories that
pass an obstacle. However, generating aggressive trajectories
with the joystick is a difficult task even for the operator.
Therefore, we designed an augmented control for the joystick
to enhance the performance of the helicopter. In detail, we
used "up" and "down" movements of the joystick to increase
or decrease the $T_{col}$ that is applied to the actuators. For the
"right" and "left" movements of the joystick, we chose to
control the roll angle using PD control.
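
One way to realize such an augmented joystick interface is sketched below in Python. The function name, the gains `kp` and `kd`, the ramp rate, the roll limit and the sign conventions are all hypothetical; only the overall scheme (collective thrust from up/down, PD roll control from left/right) follows the description above.

```python
def joystick_to_thrusts(stick_up, stick_right, t_col, roll, roll_rate,
                        dt=0.01, col_rate=1.0, max_roll=0.5, kp=2.0, kd=0.5):
    """Map joystick deflections in [-1, 1] to left/right thrust commands.

    Up/down ramps the collective thrust; left/right sets a roll reference
    that a PD law tracks through the cyclic thrust. All gains, rates and
    limits are hypothetical values for illustration.
    Returns (t_left, t_right, new_t_col).
    """
    t_col = t_col + col_rate * stick_up * dt          # increase or decrease T_col
    roll_ref = max_roll * stick_right                 # desired roll angle [rad]
    t_cyc = kp * (roll_ref - roll) - kd * roll_rate   # PD control on roll
    # Invert T_col = T_L + T_R and T_cyc = T_L - T_R
    t_left = 0.5 * (t_col + t_cyc)
    t_right = 0.5 * (t_col - t_cyc)
    return t_left, t_right, t_col
```

The function is meant to be called once per control period, feeding the returned collective thrust back in as `t_col` on the next call.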

In the original maneuver, the obstacle's distance and the
highest point are at the coordinates where the $\psi$ and $\theta$ angles
are 220° and 60°, respectively, and the helicopter stops at the
coordinates where $\psi$ = 28°, $\theta$ = 317° and $\phi$ = −17° (see Figure 5).

Fig. 5. Original maneuver achieved by an operator

Fig. 7. Trajectory generation for the second primitive - pitch



2) Trajectory Learning: From several demonstrations, it is
observed that our operator follows two distinct patterns to carry
out the maneuver. Accordingly, these two patterns suggest an
equilibrium point at the top of the obstacle. Therefore, to fly
over different obstacles, the acquired primitive is segmented
into two primitives at the highest pitch angle. Fig. 6 and Fig.
7 show the results of the DMP algorithm for the pitch angle. The
top left graphs are results for pitch angles, where green lines
represent the operator input for the trajectories and blue lines
represent the fittings that the DMP computes for different start
and end points. Hence, desired trajectories in these graphs are
not on top of the trajectories generated by the operator. Other
graphs show the time evolution of the DMP parameters.

Fig. 6. Trajectory generation for the first primitive - pitch

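The segmentation of a demonstration at the highest pitch angle can be sketched as follows. The helper name `split_at_peak` and the sample angles in the usage line are hypothetical; the only rule taken from the text is that the split occurs at the highest pitch angle.

```python
def split_at_peak(pitch, travel):
    """Split a demonstrated maneuver into two primitives at the highest pitch angle.

    pitch and travel are equally long lists of sampled angles. The sample at
    the peak is shared: it ends the first primitive and starts the second.
    """
    peak = max(range(len(pitch)), key=lambda k: pitch[k])
    first = (pitch[:peak + 1], travel[:peak + 1])
    second = (pitch[peak:], travel[peak:])
    return first, second
```

For example, `split_at_peak([0, 20, 60, 40, 28], [0, 100, 220, 280, 317])` splits both angle sequences at the third sample, where the pitch angle is largest.
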
3) Synchronization of primitives: The two primitives created
in the previous sections are defined as trajectories between
certain start and end points. However, the end point of the first
trajectory does not necessarily match the starting point
of the second trajectory. We use partial contraction theory [13]
to force the first trajectory to converge to the second one.
However, since we want to use the contraction as a transition
between two trajectories, coupling is enabled towards the end
of the first primitive. Figure 8 shows how the two trajectories
evolve in time. In the first primitive, the goal positions of the $\psi$
and $\theta$ angles are changed to 150° and 50°, respectively, where the
original angles are $\psi$ = 220° and $\theta$ = 60°. In the second
primitive, the goal position of the $\psi$ angle is changed from
317° to 300°.




Fig. 8. Time evolution of primitive-1 and primitive-2 merged. 

4) Experiments on the Helicopter: The tracking performance
of the helicopter is shown in Figure 9. The
helicopter followed the desired ($\psi$ and $\theta$) angles almost
perfectly. However, the trajectory of the roll angle differs
somewhat from the desired one, since we control two parameters
($\psi$ and $\theta$) and the goal positions of the DMPs are different.
We should highlight, however, that the two roll trajectories follow
the same pattern. In the figure, the last part of the roll trajectory
exhibits an oscillation that can be prevented by roll control,
since the other parameters are almost constant. The tracking
performance can be further improved by applying discrete
nonlinear observers to get better velocity and acceleration
values. Figure 10 shows snapshots of the maneuver.




Fig. 9. Tracking performance of the helicopter.

Fig. 10. Snapshots of the obstacle avoidance maneuver.

V. Extensions of DMP 

A. Dynamical System with First-Order Filters 

The DMP algorithm can be improved by replacing the first
system with the equations shown below:

$$\tau \dot{y} + a_1 y = x \tag{25}$$
$$\tau \dot{x} + a_2 x = g + f \tag{26}$$

which is equivalent to

$$\tau^2 \ddot{y} + \tau (a_1 + a_2) \dot{y} + a_1 a_2 y = g + f \tag{27}$$

By introducing two first-order filters, we guarantee the stability
of the system against time-varying parameters like $\tau(t)$ or $g(t)$.
Since the system is linear without the $f$-function (Eq. (27)), we
achieve the learning and modulation properties of the DMP using the
$f$ in either Eq. (25) or Eq. (26). For further applications, we
will use this model to generate primitives for time-varying
goal points.
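
The equivalence between the two first-order filters and the second-order form can be verified by eliminating $x$. The short derivation below is a sketch that assumes Eq. (25) reads $\tau \dot{y} + a_1 y = x$ and that $\tau$, $a_1$ and $a_2$ are constant:

```latex
\begin{aligned}
x &= \tau \dot{y} + a_1 y
  \;\Rightarrow\; \tau \dot{x} = \tau^2 \ddot{y} + \tau a_1 \dot{y}, \\
\tau \dot{x} + a_2 x
  &= \tau^2 \ddot{y} + \tau a_1 \dot{y} + a_2 (\tau \dot{y} + a_1 y) \\
  &= \tau^2 \ddot{y} + \tau (a_1 + a_2) \dot{y} + a_1 a_2 y = g + f .
\end{aligned}
```

Each filter has the form $\tau \dot{s} + a s = \text{input}$, whose Jacobian $-a/\tau$ is uniformly negative whenever $a > 0$ and $\tau(t)$ is positive and bounded, so the cascade remains a hierarchy of two contracting systems even if $\tau$ or $g$ varies in time.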

B. Generating New Primitives 

Experiments on frog's spinal cord lISTl . ll26l . ll47l sug- 
gest that movement primitives can be generated from linear 
combinations of vectorial force fields which lead the limb 
of a frog to the virtual equilibrium points. In [47], it is
also pointed out that the vectorial summation of two force fields
with different equilibrium points generates a new force field
whose equilibrium point is at an intermediate location between the
original equilibrium points. In this perspective, we will use
two methods to generate new primitives. 

1) Two-way Synchronization of DMPs: Consider a system

$$\dot{y}_1 = f(y_1, t) + K(u(y_2) - u(y_1)) \tag{28}$$
$$\dot{y}_2 = f(y_2, t) + K(u(y_1) - u(y_2)), \quad K > 0 \tag{29}$$

where $y_1$ and $y_2$ represent the first and the second primitive,
respectively. From partial contraction theory,
$y_1$ and $y_2$ converge together exponentially if $f - 2Ku$ is
contracting. Since DMPs are already contracting, we achieve
synchronization using contracting inputs. In Fig. 11 (top), the new
primitive is a linear combination of the sine and cosine primitives.
Also in the same figure, the coupling forces account for the
oscillations before synchronization happens.
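
To see where the condition on $f - 2Ku$ comes from, both equations can be rewritten with a common input. The following step is a sketch of that reasoning rather than a full proof:

```latex
\begin{aligned}
\dot{y} &= f(y, t) - 2K u(y) + K\bigl(u(y_1) + u(y_2)\bigr), \\
\dot{y}_1 - \dot{y}_2
 &= \bigl[f(y_1, t) - 2K u(y_1)\bigr] - \bigl[f(y_2, t) - 2K u(y_2)\bigr].
\end{aligned}
```

Setting $y = y_1$ or $y = y_2$ in the first line recovers the two coupled equations, so both primitives are particular solutions of the same auxiliary system. If $f - 2Ku$ is contracting, all solutions of that system converge to each other, and in particular $y_1$ and $y_2$ do.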




Fig. 11. Top: Synchronization of sine and cosine primitives. Bottom: New
primitive generated by the linear combinations of weights from sine and cosine
primitives

2) Combination of Primitives using Weights: In DMPs, as
shown before, the system behaves linearly and superposition
applies. Therefore, in the $f$-function, a linear combination of
the weights from different primitives produces a linear
combination of the primitives. For rhythmic DMPs, as an example,
we combine the weights of the sine and cosine primitives
($w_{new} = 0.5 w_{sine} + 0.5 w_{cosine}$) to generate a new primitive
(see Fig. 11 (bottom)). For a regular DMP, however, we
cannot achieve the desired trajectories even though we have
linearity, because the input point $g$ is not compatible
with the weights changing with respect to the couplings. For
this reason, we will modify the equations in our later
research.
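
The linearity of the forcing term in the weights can be checked directly. The Python sketch below uses the rhythmic kernels $\exp\{h(\cos(\phi - c_i) - 1)\}$ with one scalar weight per kernel and a constant amplitude, which is a simplification assumed here; the function names and the placeholder weights in the usage note are hypothetical and are not learned from data.

```python
import math

def rhythmic_forcing(weights, phi, centers, h=2.0, amplitude=1.0):
    """Scalar forcing term of a rhythmic DMP at phase phi.

    Simplification assumed here: one scalar weight per kernel and a
    constant amplitude, instead of a weight vector per kernel.
    """
    psi = [math.exp(h * (math.cos(phi - c) - 1.0)) for c in centers]
    return amplitude * sum(p * w for p, w in zip(psi, weights)) / sum(psi)

def blend(w_a, w_b, a=0.5, b=0.5):
    """Linear combination of two weight vectors, e.g. 0.5 * w_sine + 0.5 * w_cosine."""
    return [a * x + b * y for x, y in zip(w_a, w_b)]
```

For any phase, `rhythmic_forcing(blend(w_sine, w_cosine), phi, centers)` equals the average of the two individual forcing terms, because the normalized kernel sum is linear in the weights.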

VI. Conclusion 

In this paper, we use a novel approach, inspired by biological
experiments and humanoid robotics, which uses control
primitives to imitate the data taken from a human-performed
obstacle avoidance maneuver. In our model, the DMP computes the
trajectory dynamics so that we can generate complex primitive
trajectories for different given start and end points, while one-way
coupling ensures smooth transitions between primitives
at the equilibrium points. We demonstrate our algorithm with
an experiment: we generate a complex, aggressive maneuver,
which our helicopter could follow within a given error bound
at a desired speed. Future research will be conducted on
different combinations of primitives using partial contraction
theory. We expect these techniques to be particularly useful
when the system dynamic models are very coarse, as e.g.
in the case of flapping wing systems and new bio-inspired
underwater vehicles.